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due to the speedy advancement of Artificial Intelligence (AI). Nowadays, a lot of 
applications, including phone unlocking systems, criminal identification systems, and even home security systems, use face recognition as a common technique. Because this method requires only a facial image rather than an external token such as a key or card, it is considered more secure. Face detection and face identification are often the first two elements of a human recognition system. During the COVID-19 pandemic, wearing a face mask has been regarded as one of the best ways to stop the spread of the virus; the risk of contracting the virus can be reduced by almost 70% simply by wearing a face mask. To promote community health, this study aims to produce a highly precise, real-time method that can effectively recognize people and identify non-mask faces in public. When a person stands in front of the device, the application detects the human face automatically using detection, extraction, and recognition algorithms. The proposed work applies the Viola-Jones algorithm for face recognition and the YOLOv5 algorithm for mask detection and classification. When tested, the proposed work shows an accuracy of 92.8% in mask detection.


Introduction

Face recognition is now one of the most important biometric research fields, serving many purposes. Security is a major concern in virtually every field. Face recognition involves detecting the face, determining its position, image pre-processing, and so on. First, we must understand that face detection and face recognition are two distinct things, although each depends on the other. Face detection is the technique by which a system identifies faces in digital images and video streams irrespective of the source, while face recognition is the association of a given face with a given name in digital images. Face recognition is a core component of the human sensory system and a common human activity, yet developing a similar computational model for this task is challenging. The primary benefit of a facial recognition system is that it can identify a face at a distance and without human intervention (Howell et al., 2022). In high-security locations such as banks and government buildings, and in public places such as malls, parks, railway stations, and bus stops, this kind of technique is necessary.

A face recognition system comprises three main steps: face detection, feature extraction, and face matching. The detected and processed face is matched against a database of known faces to identify the person (Kortli et al., 2020). The most common approaches to facial identification are feature analysis, neural networks, Eigenfaces, and automatic face processing. Although facial recognition technology has improved tremendously, further improvements in accuracy and dependability are still needed.

The Viola-Jones object detection framework was presented by Paul Viola and Michael Jones in 2001. The framework was motivated by the difficulty of face detection using traditional techniques, although it can also be used to detect various other objects. The technique is powerful and achieves high recall and precision. For feature selection, it uses an AdaBoost classifier and Haar-like features.

From 2019 to 2022, the world was overrun by the COVID-19 virus. According to numerous studies, COVID-19 spreads mostly through social interaction with infected people and through droplet and aerosol transmission. Therefore, most governments recommended measures such as quarantine and lockdown to prevent the increase of COVID-19 infections. Encouraging people to wear protective face masks correctly has accordingly been identified as an urgent and effective way to stop the spread of COVID-19.

Face mask detection is the process of checking whether a person is wearing a mask. The initial study on face detection was carried out in 2001, using hand-crafted features and traditional machine learning techniques to train efficient classifiers for detection and recognition (Campbell et al., 2022). This approach suffers from complex feature design and subpar detection accuracy. Many algorithms are used for face detection, including SVM, PCA, KNN, and CNN; YOLO came later and outperforms these algorithms. YOLO trains on whole images and directly optimizes detection performance. This unified model has several advantages over traditional object detection methods. First and foremost, YOLO is fast, and it learns generalizable representations of objects. When trained on real-world images and tested on artwork, YOLO greatly surpasses detection methods such as KNN and SVM.


Background Details 
Face Recognition System 

Face identification analyzes a person's facial traits in order to recognize them. It has grown in popularity in security and management systems. Researchers first explored the possibility of using computers to recognize human faces in the early 1960s, which is when the technology first drew attention. Early facial recognition systems used simple methods such as template matching, which compared an unknown face's traits with a predefined set of prototypes. These early systems were limited by their reliance on preset templates and their inability to adapt to changes in pose, lighting, and facial expression. Over time, experts in this area created more elaborate facial recognition algorithms using artificial intelligence and machine learning methods. Face recognition technology is now used for a wide spectrum of tasks, from unlocking cell phones to identifying crime suspects. Consequently, research continues on improving the reliability, fairness, and transparency of face recognition systems.
Face Mask Detection 

Face mask detection is the task of determining whether someone is wearing a mask. The adoption of face mask detection technology has been driven by the need to enforce public health regulations related to the COVID-19 pandemic. However, the technology has several limitations. One of the main issues is that the effectiveness of a face mask detection system can be affected by factors such as pose and occlusion, which can make it hard for the system to differentiate between a face mask and other items on the face. Regardless of these difficulties, face mask detection technology is likely to continue to play a significant role in initiatives aimed at improving public health, with researchers constantly searching for ways to boost its precision and consistency (Pebrianto et al., 2022).
Viola Jones Algorithm 

Viola-Jones is a face detection algorithm proposed by Paul Viola and Michael Jones in 2001, and it is used here as the detection stage of the face recognition system. It is well suited to real-time face recognition applications because it is both precise and efficient. It has an accuracy rate of 95%. Figure 1 shows the process involved in the Viola-Jones algorithm.
YOLOv5 

YOLOv5 is an object detection algorithm that was released in 2020 and is considered one of the most accurate algorithms. It uses a CNN to perform object detection. Because YOLOv5 is a single-stage object detector, speed is one of its key features. CSP-Darknet53 serves as the backbone of YOLOv5. Its accuracy is 95.76%. The process of YOLOv5 is shown in Figure 2.
Literature Review 

Table 1 summarizes the literature review, based on research papers published over the years on algorithms related to face recognition and mask detection systems.


Figure 1. Process of Viola Jones (flowchart boxes: frame, frame processing, Haar features, integral image, AdaBoost machine learning, cascade classifier, face / not face)

Figure 2. Process of YOLOv5 (flowchart boxes: start, configure GPU environment in Google Colab, set up PyTorch environment, import YOLOv5s, import custom dataset, import the pre-trained weights, train the model, test the model)

Table 1. Literature review

No. | Author(s) | Algorithms used | Application work with accuracy (A) | Limitation/Drawback
1 | Leo et al., 2018 | SVM | Expression-invariant 3D face recognition system. A = 94.39% | 1. It does not perform well with large datasets. 2. It performs poorly with noisy datasets.
2 | Peng et al., 2021 | Principal Component Analysis (PCA) | PCA is used for face recognition. A = 90% | It is poorly written. It uses only linear combinations of the features of the data.
3 | Suyal et al., 2022 | K-Nearest Neighbor (KNN) | KNN is used for object detection. | 1. High memory is needed since all of the training data must be stored. 2. Because it stores all of the training data, it is computationally expensive.
4 | Hussain et al., 2022 | Deep CNN, MobileNetV2 | Based on MobileNet technology and deep CNN. A = 97% | 1. A lot of training data is needed. 2. The training process takes a long time. 3. MobileNetV2 uses depthwise separable convolution, which may affect training efficiency.
5 | Jiang et al., 2021 | YOLOv3 | Real-time face mask detection method based on YOLOv3. A = 75% | Large localization error and lower recall.
6 | Indulkar et al., 2021 | YOLOv4 | Social distancing and mask detection using YOLOv4 during COVID-19. | Large localization error and lower recall, but 10% better performance than YOLOv3.
7 | Shatnawi et al., 2018 | Eigenfaces | Shows how the measurement of different face distances impacts eigenvalues. | 1. It is sensitive to head position and lighting conditions in an image. 2. Finding the eigenvectors and eigenvalues on PPC takes a lot of time.
8 | Upendra et al., 2021 | OpenCV | OpenCV is used for real-time face recognition. A = 98% | 1. The facial recognition software is particularly sensitive to position changes. 2. Different angles or head motions can change the texture of a person's face, resulting in an incorrect image.
9 | Kaur et al., 2017 | Viola Jones | Performance-based analysis based on Viola-Jones. | 1. It only allows binary categorization. 2. It works best when facing the target. 3. Extremely high or low brightness exposure makes the model sensitive.
10 | Pranav et al., 2020 | CNN | CNN is used for evaluation of faces in real time. A = 97.25% | The position and orientation of objects are not encoded by the model.
11 | Solomon et al., 2019 | Image processing | Focuses mainly on the challenges faced in face recognition systems. | 1. Ageing affects the recognition of faces. 2. Variation in facial expression may reduce the efficiency of the model.
12 | Gupta et al., 2023 | AlexNet and neural network sensor | Based on the COVID-19 pandemic, when masks became compulsory; various machine learning techniques are used to detect them. A = 98.39% | The use of big convolution filters increases the complexity of the model.
13 | Nagoriya et al., 2020 | Raspberry Pi and ResNet | Live face mask detection system. A = 96.02% | It uses a deep, complex network that takes weeks to train, making it practically inefficient for solving real-life problems.
14 | Horvat et al., 2022 | YOLO and YOLOv5 | Localization and classification of images and a comparative study using YOLOv5. | It has trouble identifying tiny images among a collection of photographs.
15 | Chaudhari et al., 2018 | Viola Jones | Viola Jones and neural networks are discussed for face detection. | It is sensitive to high or low image brightness.

Materials and Methodology
Details of Images Dataset 
The dataset used in this paper consists of pictures with various dimensions and angles. For face recognition, a customized dataset is used, to which real-time images are added during execution. These images are all in JPG format and occupy 289 MB of storage. For face mask detection, face images and videos are collected from sources such as Kaggle to train and test the system: 1400 images, of which 1167 are used for training and 233 for testing, and 371 videos, of which 250 are used for training and 121 for testing, as given in Table 2. These are all in JPG format and occupy 530 MB of storage (Deepak, 2021).


Table 2. Total images in the dataset

Dataset | Training | Testing
Images | 1167 | 233
Videos | 250 | 121


Table 3 shows the distribution of image dimensions in the dataset, in pixels. The maximum and minimum image widths are 608 and 194, respectively. The maximum and minimum image heights are 612 and 193, respectively.


Table 3. Dataset distribution (image dimensions in pixels)

Image dataset | Value
Maximum height | 612
Maximum width | 608
Minimum height | 193
Minimum width | 194


Sample Images
1. Face Recognition

Figure 3. Sample image for face recognition from the customized dataset

2. Face Mask Detection

Figure 4. Sample images of people with and without masks (Deepak, 2021)


Proposed System 

Systems design is the process of establishing requirements such as a system's architecture, components and modules, interfaces, and data. Face recognition technology helps recognise human faces in digital images and video frames (Bansal et al., 2022). Object detection technology seeks out occurrences of objects in digital photos and videos. The proposed system is depicted in Figure 5. The system has three stages: face detection, feature extraction, and face matching.
Face Detection and Acquisition of Face Data Stage 

In Figure 5, the face recognition system begins by detecting faces in a provided image. The focus of this step is to determine whether the incoming image contains human faces (Kortli et al., 2020). Face detection can be difficult owing to variations in illumination and facial expression. Pre-processing procedures are used to make the face recognition system more robust. The Viola-Jones algorithm is used for detecting and locating the human face in the image. Furthermore, using YOLO techniques, the face detection stage can be used to classify videos and images (Natarajan et al., 2022).


Feature Extraction 

Figure 5 shows the second level of the facial recognition framework. The primary purpose of this step is to isolate the facial features from the face images found during the detection stage. At this level a face is represented by a "signature," a collection of feature vectors that reflect prominent facial features such as the mouth, nose, and eyes, together with their geometric distribution. The size, shape, and structure of each face distinguish it.

Figure 5. Framework for real-time face recognition in image systems (blocks: images, face detection, feature extraction, face matching/recognition, pre-trained model of enrolled user)

Face Recognition

This is the third level of the face recognition framework (Figure 5). At this stage, the features extracted during the feature extraction procedure are compared with the known faces stored in a given database. Identification and verification are the two main functions of facial recognition. In the identification process, a face is compared with other faces to decide which one is the most likely match (BattaL et al., 2022). In the verification process, a face is compared with a trained face in the database to determine whether it should be approved or rejected.


Algorithm used for Face Recognition System
Viola Jones

The Viola-Jones technique, devised in 2001 by Paul Viola and Michael Jones, is an object detection framework that detects visual elements in real time. The Viola-Jones algorithm has two stages: training and detection. Training happens before detection, but detection is discussed first for clarity. Because Viola-Jones was developed for frontal faces, it identifies them better than faces seen from the side, from above, or from below. Before detecting a face, the image is converted to grayscale, which is more convenient to work with and consumes fewer processing resources. After detecting the face in the grayscale image, the Viola-Jones method locates it on the coloured image. The four fundamental ideas that underpin the Viola-Jones algorithm are discussed below: i) Haar-like features, ii) integral images to accelerate feature computation, iii) AdaBoost learning to select features, and iv) a classifier cascade for quick rejection of windows without faces.
Haar-like features

Haar-like features are image features used in the Viola-Jones object detection system. They are simple rectangular features computed as the difference between the sums of pixel intensities in two adjacent rectangular parts of an image. Haar-like features are used as input to the classifier, on the premise that particular patterns of Haar-like features are more likely to be present in positive examples than in negative examples. Each feature defines a group of rectangles in the image window. A rectangle can be either white or black. The feature value is derived by subtracting the sum of the pixel values under the white rectangles from the sum of the pixel values under the black rectangles. The feature value will be around zero for "flat regions", i.e., where all the pixels have the same value. A large feature value is obtained in regions where the pixels under the black and white rectangles are very different. The classifier then evaluates each window to determine whether it contains the object of interest. Overall, the use of Haar-like features is a key aspect of the Viola-Jones framework, allowing efficient and accurate object detection in a broad range of applications. As shown in Figure 6, features A and B are of great importance in face detection because the area around the eyes is darker than the area around the cheeks, and the eye area is darker than the nose area.
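The feature value described above can be illustrated with a small sketch. This is a minimal NumPy example of a two-rectangle feature, assuming a black upper half and a white lower half; the function name, the rectangle layout, and the toy windows are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def two_rect_haar_feature(window, top, left, height, width):
    """Two-rectangle Haar-like feature (hypothetical layout):
    sum under the black upper half minus sum under the white lower half."""
    half = height // 2
    black = window[top:top + half, left:left + width].sum()
    white = window[top + half:top + 2 * half, left:left + width].sum()
    return float(black - white)

# A flat region: every pixel has the same value, so the feature is zero.
flat = np.full((4, 4), 100, dtype=np.int64)

# A horizontal edge: bright upper half, dark lower half.
edge = np.vstack([np.full((2, 4), 200), np.full((2, 4), 50)]).astype(np.int64)

flat_value = two_rect_haar_feature(flat, 0, 0, 4, 4)
edge_value = two_rect_haar_feature(edge, 0, 0, 4, 4)
print(flat_value, edge_value)
```

The flat window gives a value of zero, while the window containing an edge gives a large value, which is the behaviour the text attributes to flat and contrasting regions.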
Integral Images for accelerating feature computation 
Integral images are a major component of the Viola-Jones object detection system, used to speed up the computation of Haar-like features. An integral image is a 2D matrix of cumulative sums of the pixel intensity values in the input image: each element of the integral image is the sum of all the pixels in the input image above and to the left of that element, which allows the integral image to be computed in a single pass over the input image. Because the sum over any rectangle can then be obtained from four lookups in the integral image, each Haar-like feature can be computed with a few arithmetic operations, independent of its size or placement within the image (Amodeo et al., 2022). Overall, the use of integral images is a crucial optimization in the Viola-Jones technique that enables quick object detection and efficient computation of Haar-like features.

Figure 6. Types of Haar-like features used in face detection
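The integral image and the constant-time rectangle sum it enables can be sketched as follows. This is a minimal NumPy illustration; the function names and the zero-padded layout are assumptions of this sketch rather than details given in the paper.

```python
import numpy as np

def integral_image(img):
    """Zero-padded summed-area table: ii[r, c] is the sum of img[:r, :c]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.astype(np.int64).cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return int(ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left])

img = np.arange(16).reshape(4, 4)   # toy 4x4 "image" with values 0..15
ii = integral_image(img)

total = rect_sum(ii, 0, 0, 4, 4)    # whole image
patch = rect_sum(ii, 1, 1, 2, 2)    # 2x2 patch starting at row 1, column 1
print(total, patch)
```

The cost of `rect_sum` does not depend on the rectangle size, which is why a Haar-like feature of any scale costs the same handful of operations.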


AdaBoost learning for feature selection

The Viola-Jones object detection framework uses the machine learning technique AdaBoost (Adaptive Boosting) for feature selection. AdaBoost aims to increase the precision and efficiency of the detection algorithm by choosing a small subset of informative features from the large number of available features. In the Viola-Jones framework, each weak classifier is trained on a single Haar-like feature, and AdaBoost combines the weak classifiers into a strong classifier. The AdaBoost technique iteratively selects the most helpful feature and adds it to the strong classifier, reweighting the training instances after each iteration in order to focus on the poorly classified cases. In general, AdaBoost learning is an important stage in the Viola-Jones object detection framework, since it enables the selection of informative Haar-like features and improves the accuracy and efficiency of the detection algorithm.
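A compact sketch of this selection loop is given below. It assumes that each column of `X` holds the value of one Haar-like feature and that each weak classifier is a single-feature threshold (a decision stump); the toy data, the function names, and the number of rounds are illustrative assumptions, not the paper's training setup.

```python
import numpy as np

def best_stump(X, y, w):
    """Pick the (feature, threshold, polarity) with the lowest weighted error."""
    best = (None, None, None, np.inf)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * X[:, j] >= pol * thr, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, thr, pol, err)
    return best

def adaboost(X, y, rounds):
    """Train `rounds` weak classifiers; labels y must be +1 or -1."""
    w = np.full(len(y), 1.0 / len(y))          # start with uniform sample weights
    model = []
    for _ in range(rounds):
        j, thr, pol, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # weight of this weak classifier
        pred = np.where(pol * X[:, j] >= pol * thr, 1, -1)
        w = w * np.exp(-alpha * y * pred)      # raise weights of misclassified samples
        w = w / w.sum()
        model.append((j, thr, pol, alpha))
    return model

def predict(model, X):
    """Strong classifier: sign of the weighted vote of the weak classifiers."""
    score = np.zeros(X.shape[0])
    for j, thr, pol, alpha in model:
        score += alpha * np.where(pol * X[:, j] >= pol * thr, 1, -1)
    return np.where(score >= 0, 1, -1)

# Toy data: only the feature in column 1 separates the two classes.
X = np.array([[5, 1, 3], [2, 2, 9], [7, 8, 1],
              [1, 9, 4], [4, 7, 8], [6, 0, 2]], dtype=float)
y = np.array([-1, -1, 1, 1, 1, -1])

model = adaboost(X, y, rounds=2)
print(model[0][0], predict(model, X))
```

On this toy data the first selected feature is column 1, the only informative one, which mirrors how AdaBoost narrows a large feature pool down to a few useful features.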
Cascade of classifiers for fast rejection of windows 
without faces 

The Viola-Jones object detection methodology relies on a cascade of classifiers to swiftly eliminate image windows that are unlikely to contain a face. Figure 7 depicts the working of the cascade classifier. This makes the method faster and more effective by allowing it to concentrate its computation on the limited group of image windows that are more likely to contain a face. The classifier cascade of the Viola-Jones framework consists of a succession of stages, each with a set of weak classifiers. AdaBoost is used to train the weak classifiers on a set of positive and negative examples, and each weak classifier evaluates the input image window using a single Haar-like feature. During detection, the image window is passed through each stage of the cascade, with each stage using a different set of weak classifiers to evaluate the window. If the window fails a stage, it is immediately rejected and the algorithm moves on to the next window.
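The early-rejection logic can be sketched in a few lines. In this hypothetical example a stage is a list of weak classifiers, each given as (feature index, threshold, vote weight), plus a stage threshold; the stage contents and feature values are made up for illustration and are not taken from a trained detector.

```python
def run_cascade(window_features, stages):
    """Return True only if the window passes every stage; stop at the first failure."""
    for weak_classifiers, stage_threshold in stages:
        score = sum(alpha for feature_index, thr, alpha in weak_classifiers
                    if window_features[feature_index] >= thr)
        if score < stage_threshold:
            return False  # rejected early; later stages are never evaluated
    return True

# Two hypothetical stages: a cheap one-feature stage, then a two-feature stage.
stages = [
    ([(0, 10.0, 1.0)], 1.0),
    ([(1, 5.0, 0.6), (2, 3.0, 0.7)], 1.0),
]

face_like = [12.0, 6.0, 4.0]    # passes both stages
background = [2.0, 6.0, 4.0]    # fails stage 1
partial = [12.0, 6.0, 1.0]      # passes stage 1, fails stage 2

results = [run_cascade(w, stages) for w in (face_like, background, partial)]
print(results)
```

Because most windows in an image are background, most of them exit at the first, cheapest stage, which is where the speed of the cascade comes from.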


Figure 7. Working of cascade classifier (all sub-windows enter the cascade; a sub-window that passes every stage goes on to further processing, and a sub-window that fails a stage is rejected)

Proposed Work 
The objective of the proposed work is to create an object detection model that is both accurate (Y-axis) and fast at inference (X-axis). Preliminary results show that YOLOv5 surpasses other state-of-the-art methods for this purpose. The graph in Figure 10 shows that all YOLOv5 variants train faster than EfficientDet. The largest YOLOv5 model, YOLOv5x, processes images many times faster than the EfficientDet D4 model while maintaining a comparable level of accuracy. Figure 10 shows the performance of all variants of the YOLOv5 algorithm.
To run the trained model on video frames
ret, frame ← read the video frame from webcam
frame ← resize the video frame
frm ← [frame] (a list holding the single video frame)
results ← self.model(frm)
labels, cord ← labels and coordinates of the objects detected in results
plot bounding boxes and labels onto the frame according to the label value for that frame
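The last two steps, turning the detector output into boxes that can be drawn on the frame, can be sketched as below. The sketch assumes that each row of `cord` holds normalized corner coordinates followed by a confidence score, `[x1, y1, x2, y2, conf]`; that layout, the 0.5 confidence threshold, the class ids, and the function name are assumptions made for illustration, and the arrays are hand-written stand-ins for real model output.

```python
import numpy as np

def boxes_to_pixels(labels, cord, frame_width, frame_height, min_conf=0.5):
    """Convert normalized [x1, y1, x2, y2, conf] rows into pixel boxes,
    keeping only detections whose confidence is at least min_conf."""
    boxes = []
    for label, row in zip(labels, cord):
        x1, y1, x2, y2, conf = row
        if conf < min_conf:
            continue
        boxes.append((int(label),
                      int(x1 * frame_width), int(y1 * frame_height),
                      int(x2 * frame_width), int(y2 * frame_height)))
    return boxes

# Stand-in detector output for one 640x480 frame (hypothetical class ids 0 and 1).
labels = np.array([0, 1])
cord = np.array([[0.25, 0.25, 0.5, 0.5, 0.9],
                 [0.10, 0.10, 0.2, 0.2, 0.3]])

boxes = boxes_to_pixels(labels, cord, 640, 480)
print(boxes)
```

Each returned tuple holds the class id and the pixel corners of one box, which is what a drawing routine needs to plot the rectangle and its label on the frame.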

Proposed Algorithm 

img ← read image from webcam
imgs ← resize the image
convert imgs from BGR to RGB format
face_currentframe ← face_recognition.face_locations(imgs)
encode_currentframe ← face_recognition.face_encodings(imgs, face_currentframe)
encode_listknown ← find_encoding(image) for each image in dataset
for each (encode_face, face_loc) in (encode_currentframe, face_currentframe):
    matches ← face_recognition.compare_faces(encode_listknown, encode_face)
    facedis ← face_recognition.face_distance(encode_listknown, encode_face)
    matchIndex ← np.argmin(facedis)
    if matches[matchIndex] then
        create bounding box for displaying the face with label


Table 4. Notations and descriptions used in the proposed algorithm

Notation | Description
img | image read as input
imgs | the resized image, stored in the proper format
face_currentframe | location of each face
encode_currentframe | encodings (measurements) of each face
encode_listknown | list of encodings of each face in the dataset
matches | result of matching the similarities between two faces
facedis | Euclidean distance to each face encoding
matchIndex | index of the minimum distance, which names the best match
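The matching step of the algorithm (distance, minimum index, and match check) can be sketched with NumPy alone. The sketch assumes that encodings are compared by Euclidean distance and that a match is accepted when the distance is within a tolerance of 0.6; the tolerance value, the two-dimensional toy encodings, and the names are illustrative assumptions (real face encodings have many more dimensions).

```python
import numpy as np

def best_match(encode_listknown, encode_face, names, tolerance=0.6):
    """Return the name of the closest known encoding, or None if it is too far away."""
    facedis = np.linalg.norm(
        np.asarray(encode_listknown) - np.asarray(encode_face), axis=1)
    match_index = int(np.argmin(facedis))      # index of the smallest distance
    if facedis[match_index] <= tolerance:      # accept only sufficiently close faces
        return names[match_index]
    return None

# Toy 2-D encodings for two hypothetical enrolled users.
known = [[0.0, 0.0], [1.0, 1.0]]
names = ["alice", "bob"]

close = best_match(known, [0.9, 1.1], names)   # near the second encoding
far = best_match(known, [5.0, 5.0], names)     # far from every encoding
print(close, far)
```

Returning `None` for a distant face is one way to handle unknown people; the bounding box would then be drawn without a name.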


Algorithm Used for Face Mask Detection System
YOLOv5 (You Only Look Once)
Process Involved in YOLOv5 for Face Mask Detection

Figure 8. Flow chart for face mask detection using YOLOv5 (boxes: frame, face mask images enhancement, face mask images, training dataset, YOLOv5, segmentation, face mask images recognition, face mask detection: with mask / no mask)


YOLO-Face Scheme Detection

The major element of the YOLO-Face scheme's detection network, the YOLOv5 object detection method, is briefly described in this section. This method dramatically enhances the efficacy of face detection technologies (BattaL et al., 2022). Prediction of classes: by determining only whether each of the two classes, face and non-face, is present, this technique reduces the multi-label classification problem to a binary classification problem (Sapakova et al., 2022). The model, which is based on an object detection network, employs a multi-part loss function to reduce output error.

The architecture of the proposed system is depicted in Figure 11. In the proposed detection system, 24 convolutional layers are followed by 2 fully connected layers. Alternating 1×1 convolutional layers reduce the feature space from the preceding layers (Natarajan et al., 2022). The convolutional layers are pre-trained on the ImageNet classification task at half the resolution (224×224 input image), and the resolution is then doubled for detection.

The initial convolutional layers extract features from the image, while the fully connected layers of the network predict the output coordinates and probabilities. This architecture is based on the GoogLeNet model for image classification. Instead of the inception modules used by GoogLeNet, we employ only 1×1 reduction layers followed by 3×3 convolutional layers. We also create a fast version of YOLO intended to improve object detection speed. Instead of 24 convolutional layers, Fast YOLO uses nine convolutional layers and fewer filters in each layer.


Results & Discussion 
Simulation Setup

Table 5 lists the system configuration used to develop the MCNN. The hardware consists of an Intel(R) Core(TM) i5 processor with 8 GB of RAM running a 64-bit operating system. The code was written in Anaconda (JupyterLab, 64-bit) and run on Google Colab for GPU support. The operating system is Windows 11 Home.


Table 5. Simulation setup

Name | Specification
Processor | Intel Core i5, 64-bit
RAM in system | 8 GB
System type | 64-bit OS
Software used | Anaconda (JupyterLab) (64-bit)
Setup | Google Colab
OS | Windows 11 Home

Figure 9. Using the YOLO approach, the image is divided into an m × m grid (a). A grid cell is responsible for detecting an object when the center of that object falls inside it. Each grid cell predicts bounding boxes and confidence scores (b), and these are combined with the class probabilities to give the final detection (c) (Redmon et al., 2015)

Figure 10. Performance of all variants of the YOLOv5 algorithm

Figure 11. Architecture of proposed system

Results

Figure 12 plots precision on the training and validation datasets over the training epochs. The X-axis shows the number of epochs and the Y-axis shows the precision value. After 10 epochs, the model's precision was 86.78% on the training set and 82.09% on the validation set. At 20 epochs, the precision is 87.94% on the training set and 87.49% on the validation set. At 30 epochs, it is 88.27% on the training set and 88.17% on the validation set.

Figure 12. Precision of the model on the training and validation sets


Figure 13 plots recall on the training and validation datasets over the training epochs. The X-axis shows the number of epochs and the Y-axis shows the recall value. After 10 epochs, the model's recall was 72.78% on the training set and 73.09% on the validation set. At 20 epochs, the recall is 77.94% on the training set and 77.49% on the validation set. At 30 epochs, it is 81.27% on the training set and 82.17% on the validation set.

Figure 13. Recall for the training and validation sets

Evaluation Technique

The accuracy of a classification model refers to its ability to identify the class labels of a given collection of input data. It is defined as the ratio of correctly classified cases to all of the cases in the dataset (Srivastava et al., 2021). The precision of a binary classification model is the proportion of true positive examples (i.e., cases that actually belong to the positive class) among all cases the model has classified as positive (Li et al., 2022). Recall measures the ability of a binary classification model to identify every instance of the positive class in a dataset; it is also known as sensitivity or true positive rate. The accuracy of Viola Jones for the face recognition system is 91%. The precision and recall of YOLOv5 for face mask detection are 88.17% and 82.1%, respectively, as shown in Table 6. The graphical representation of the results (precision and recall) is shown in Figure 14. The output for the given input is depicted in Table 7.
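The three definitions above can be written directly in terms of the confusion-matrix counts. The following sketch uses hypothetical counts chosen only to show the arithmetic; they are not the counts behind the paper's reported results.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # correct cases over all cases
    precision = tp / (tp + fp)                   # true positives over predicted positives
    recall = tp / (tp + fn)                      # true positives over actual positives
    return accuracy, precision, recall

# Hypothetical counts for a mask / no-mask classifier.
accuracy, precision, recall = classification_metrics(tp=90, fp=10, fn=30, tn=70)
print(accuracy, precision, recall)
```

With these counts, precision is higher than recall because the classifier makes few false alarms but misses a larger number of actual positives.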


Table 6. Precision and recall of YOLOv5 for face mask detection

Parameter | Value
Precision | 88.17%
Recall | 82.1%

Figure 14. Graphical representation of the precision and recall of the proposed work (classification report; Y-axis: percentage)

Table 7. The output obtained from the proposed work


(Output images shown: face recognition; face mask detection without mask; with mask; improper lighting; improper wearing of mask)

Comparative Analysis


Table 8. Comparison of algorithms

Algorithm | Precision (%) | Recall (%)
SVM (Rana et al., 2016) | 72.7 | 70
PCA (Grogan et al., 2021) | 72 | 73
KNN (Schenkel et al., 2019) | 82.7 | 68.50
Proposed work (YOLOv5) | 88.17 | 82.1


Table 8 compares different algorithms in terms of precision and recall. Figure 15 shows the graphical representation of the comparison. YOLOv5 is found to give improved results compared with the other algorithms. For this comparison, both images and videos were used to measure precision. YOLOv5 gives higher precision than SVM, which has a precision of 72.7% and a recall of 70% (Rana et al., 2016); Principal Component Analysis (PCA), which has a precision of 72% and a recall of 73% (Grogan et al., 2021); and KNN, which has a precision of 82.7% and a recall of 68.50% (Schenkel et al., 2019). The proposed work has precision and recall values of 88.17% and 82.1%, respectively.

Figure 15. Graphical representation of the precision and recall of the proposed work (YOLOv5) in comparison with commonly used algorithms


Conclusion 

In this study, multiple image recognition algorithms were evaluated using the same data set, and the accuracy rate was measured on a number of folds. Once face recognition works effectively in a variety of settings, it can be applied to difficult applications. Unknown people can be identified using this approach. Future studies will focus on the recognition algorithm. The data collection includes images with different variations, such as photos with or without light, with or without spectacles, or with or without a smile. Using the provided data set, the algorithm produced different accuracy rates for different folds.

The study covers a variety of algorithms, including Viola Jones, that could be used to build a facial recognition system. We have evaluated a few of them and discussed their advantages and disadvantages. This method recognizes faces to a respectable degree and can therefore be applied to security applications. In the results section, the accuracy rate has been compared and examined. The graphs and result tables show that Viola Jones has the highest accuracy (91%) when compared to all other classifiers.

This paper proposes a YOLOv5-based application that recognizes faces whether or not they are wearing masks. People only need to stand in front of the camera to be identified; if recognition succeeds, an alert message shows whether they are allowed to enter the room. With this approach, human crowd control is no longer required, which saves considerable time and effort. The experiment achieves a precision of 88.17% and a recall of 82.1%, and some traditional machine learning models are used for comparison. The test data include a photo of a well-known person who is wearing a mask that does not cover the nose. We think that this design can successfully reduce exposure and support effective surveillance in light of COVID-19's worldwide influence. In testing, our success rate is 85.5%.